Combination of standard and throat microphones for robust speech recognition in highly noisy environments

Authors

Martin Graciarena

Federico Cesari

Horacio Franco

Gregory K. Myers

Cregg Cowan

Victor Abrash

Abstract

We present a method to combine standard and throat microphone signals for noise-robust speech recognition. Our approach extends the probabilistic optimum filter (POF) mapping algorithm to estimate standard microphone clean speech feature vectors from both microphones' noisy speech feature vectors. We tested the proposed approach in two noisy speech recognition tasks. In the first task, we used a large-vocabulary continuous speech recognition system and noisy speech with either artificially added noise or noise recorded in an M1 tank cockpit. In the second task, we used a real-time system and noisy speech recorded in a highly noisy environment, inside a HMMWV military vehicle. A noise-canceling microphone and a throat microphone were used in this task. Because of the highly adverse conditions in this second task, we propose an extension of the combined microphone approach that takes into account the level of noise captured by the throat microphone. The combined microphone approach significantly outperforms the single microphone approach in all the recognition experiments.
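The idea of mapping features from both microphones to clean standard-microphone features can be illustrated with a small sketch. The code below is not the authors' implementation: it is a simplified, POF-style piecewise-linear mapping that assumes a single-frame context, spherical Gaussian regions around given centroids, and concatenated standard and throat feature vectors as input. All names (`region_posteriors`, `fit_pof`, `apply_pof`, `centroids`, `var`) are hypothetical and chosen for illustration only.

```python
import numpy as np

def region_posteriors(z, centroids, var=1.0):
    """Soft assignment of each frame to each region (assumed spherical Gaussians)."""
    d2 = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)   # (T, K)
    logp = -0.5 * d2 / var
    logp -= logp.max(axis=1, keepdims=True)                        # numerical stability
    p = np.exp(logp)
    return p / p.sum(axis=1, keepdims=True)

def fit_pof(noisy, clean, centroids, var=1.0):
    """Fit one affine least-squares map per region, weighted by region posteriors."""
    post = region_posteriors(noisy, centroids, var)                # (T, K)
    Y = np.hstack([noisy, np.ones((len(noisy), 1))])               # append bias term
    maps = []
    for k in range(centroids.shape[0]):
        w = np.sqrt(post[:, k:k + 1])                              # weighted LS via row scaling
        W, *_ = np.linalg.lstsq(w * Y, w * clean, rcond=None)
        maps.append(W)
    return np.stack(maps)                                          # (K, D+1, Dc)

def apply_pof(noisy, centroids, maps, var=1.0):
    """Estimate clean features as the posterior-weighted sum of per-region maps."""
    post = region_posteriors(noisy, centroids, var)                # (T, K)
    Y = np.hstack([noisy, np.ones((len(noisy), 1))])
    per_region = np.einsum('td,kdc->tkc', Y, maps)                 # (T, K, Dc)
    return (post[:, :, None] * per_region).sum(axis=1)
```

In this sketch, the combined input would be formed as `np.hstack([std_feats, throat_feats])`, with the maps fitted on paired noisy and clean training frames. The extension that accounts for the noise level in the throat microphone is not modeled here.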

Similar resources

An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition

Convolutional Neural Networks (CNNs) have demonstrated their performance in speech recognition systems for feature extraction and acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. A Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...

Improving the performance of MFCC for Persian robust speech recognition

The Mel frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

Automatic recognition of emotional states from speech in noisy conditions has become an important research topic in emotional speech recognition in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...

Speech intelligibility in noise using throat and acoustic microphones.

INTRODUCTION: Helicopter cockpits are very noisy and this noise must be reduced for effective communication. The standard U.S. Army aviation helmet is equipped with a noise-canceling acoustic microphone, but some ambient noise is still transmitted. Throat microphones are not sensitive to air molecule vibrations and thus, transmittal of ambient noise is reduced. It is possible that throat microph...

A spatio-temporal speech enhancement scheme for robust speech recognition in noisy environments

A new speech enhancement scheme is presented integrating spatial and temporal signal processing methods for robust speech recognition in noisy environments. The scheme first separates spatially localized point sources from noisy speech signals recorded by two microphones. Blind source separation algorithms assuming no a priori knowledge about the sources involved are applied in this spatial pro...

Publication date: 2004
